Smoothed quantile residual life regression analysis with application to the Korea HIV/AIDS cohort study

Background
The residual life of a patient with human immunodeficiency virus (HIV) is of major interest to patients and their physicians. While existing analyses of HIV patient survival focus mostly on data collected at baseline, residual life analysis allows for dynamic analysis based on additional data collected over a period of time. As survival times typically exhibit a right-skewed distribution, the median provides a more useful summary of the underlying distribution than the mean. In this paper, we propose an efficient inference procedure that fits a semiparametric quantile regression model assessing the effect of longitudinal biomarkers on the residual life of HIV patients until the development of dyslipidemia, a disease becoming more prevalent among those with HIV.

Methods
For estimation of model parameters, we propose an induced smoothing method that smooths nonsmooth estimating functions based on check functions. For variance estimation, we propose an efficient resampling-based estimator. The proposed estimators are theoretically justified. Simulation studies are used to evaluate their finite-sample performance, including their prediction accuracy. We analyze the Korea HIV/AIDS cohort study data to examine the effects of CD4 (cluster of differentiation 4) cell count on the residual life of HIV patients to the onset of dyslipidemia.

Results
The proposed estimator is shown to be consistent and asymptotically normal. Under various simulation settings, our estimates are approximately unbiased. Their variance estimates are close to the empirical variances, and their computational efficiency is superior to that of the nonsmooth counterparts. Two measures of prediction performance indicate that our method adequately reflects the dynamic character of longitudinal biomarkers and residual life. The analysis of the Korea HIV/AIDS cohort study data shows that CD4 cell count is positively associated with residual life to the onset of dyslipidemia, but the effect is not statistically significant.

Conclusions
Our method enables direct prediction of residual lifetimes with a dynamic feature that accommodates data accumulated at different times. Our estimator significantly improves computational efficiency in variance estimation compared to the existing nonsmooth estimator. Analysis of the HIV/AIDS cohort study data reveals dynamic effects of CD4 cell count on the residual life to the onset of dyslipidemia.

Supplementary Information
The online version contains supplementary material available at 10.1186/s12874-024-02159-9.
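The preference for the median over the mean, and the check function on which the estimating functions are based, can be illustrated with a minimal sketch. The data below are hypothetical, and the helper names `check_loss` and `empirical_quantile_by_check_loss` are introduced only for illustration; this is not the regression procedure proposed in the paper.

```python
def check_loss(u, tau):
    """Quantile check function rho_tau(u) = u * (tau - 1{u < 0})."""
    return u * (tau - (1.0 if u < 0 else 0.0))


def empirical_quantile_by_check_loss(values, tau):
    """Return the sample value that minimizes the summed check loss."""
    candidates = sorted(values)
    return min(candidates, key=lambda q: sum(check_loss(v - q, tau) for v in values))


# Hypothetical right-skewed residual lifetimes (in years), for illustration only.
residual_lives = [0.5, 1.0, 1.5, 2.0, 12.0]

median = empirical_quantile_by_check_loss(residual_lives, 0.5)
mean = sum(residual_lives) / len(residual_lives)

# The single long lifetime pulls the mean well above the median.
print("median:", median, "mean:", mean)
```

With tau = 0.5 the check loss is half the absolute deviation, so its minimizer is a sample median; other values of tau target other quantiles of the residual life distribution.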


Regularity conditions
We assume the following regularity conditions to establish the asymptotic properties of the proposed induced smoothed estimator. These conditions are similar to those imposed in Li et al. [1].
is uniformly bounded away from 0 and from above. Also, $g'_{T_i - t_{ij}}(s \mid T_i \geq t_{ij})$ exists and is uniformly bounded on the real line.

Asymptotic properties of $\tilde{\gamma}$
The proofs follow from the asymptotic properties of the nonsmooth estimator $\hat{\gamma}$ and the arguments used in Kim et al. [2]. We first prove the consistency of the induced smoothed estimator $\tilde{\gamma}$. Under C1-C4, based on the consistency result of $\hat{\gamma}$ for $\gamma_0$ [1], establishing the following convergence result is sufficient to prove the consistency of $\tilde{\gamma}$: As $n \to \infty$, where $\epsilon_n$ denotes a positive sequence converging to 0. Let , where $\mathrm{sgn}(\cdot)$ is the sign function. Then, For a specific $t_{ij} \in \Gamma$, by Condition C2, we have
To establish the asymptotic normality of $n^{1/2}(\tilde{\gamma} - \gamma_0)$ for the induced smoothed estimator $\tilde{\gamma}$, we show the asymptotic equivalence between $n^{1/2}(\tilde{\gamma} - \gamma_0)$ and $n^{1/2}(\hat{\gamma} - \gamma_0)$. The consistency and the asymptotic normality of $\hat{\gamma}$, the nonsmooth estimator, have been established in Li et al. [1]. Thus, combining these results with the convergence result in (4), it suffices to show the following convergence result [2,7]: As $n \to \infty$, for any positive definite matrix $H$, where $\omega^*$ lies on the line segment between 0 and $\sigma_{ij} t$. By applying Condition C4 (b), the second term on the right-hand side of the equation above can be shown to converge to 0, i.e., $\exists M > 0$, Now, applying the same arguments used for $E\{\phi(-\epsilon_{ij}/\sigma_{ij})\}$ Consequently, From (6) and (7), we have the desired result: $\|\tilde{A}_n(\gamma_0) - A\| \xrightarrow{p} 0$ as $n \to \infty$.
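The idea behind the equivalence argument can be sketched numerically. Induced smoothing replaces the indicator 1{e <= 0} with its expectation under a small Gaussian perturbation, which equals Phi(-e / sigma), where Phi is the standard normal distribution function. The sketch below assumes the commonly used bandwidth sigma = sqrt(x' H x / n) with a diagonal H; this form, the covariate vector, and the function names are assumptions for illustration and are not taken from the paper.

```python
from math import sqrt
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # standard normal, mean 0 and standard deviation 1


def indicator(e):
    """Nonsmooth indicator 1{e <= 0}."""
    return 1.0 if e <= 0 else 0.0


def smoothed_indicator(e, sigma):
    """Smoothed counterpart Phi(-e / sigma) = E[1{e + sigma * Z <= 0}], Z ~ N(0, 1)."""
    return _STD_NORMAL.cdf(-e / sigma)


def bandwidth(x, h_diag, n):
    """Assumed bandwidth sqrt(x' H x / n) for a diagonal positive definite H."""
    return sqrt(sum(h * xi * xi for h, xi in zip(h_diag, x)) / n)


x = [1.0, 0.5]        # hypothetical covariate vector
h_diag = [1.0, 1.0]   # identity matrix, a simple positive definite choice
residual = 0.2        # fixed nonzero residual

# The gap between smoothed and nonsmooth versions shrinks as n grows.
gaps = [
    abs(smoothed_indicator(residual, bandwidth(x, h_diag, n)) - indicator(residual))
    for n in (10, 100, 1000)
]
print(gaps)
```

Because the bandwidth shrinks at rate n^{-1/2}, the smoothed function approaches the indicator at every nonzero residual, which is the intuition for why the smoothed and nonsmooth estimators share the same limiting distribution.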
Note that the asymptotic covariance function is the same as that of the nonsmooth estimator. Its explicit form is given as η ij S G i (γ, t j ), Here $D_k$ denotes the observed data from the $k$th subject. The Kaplan-Meier estimator $\hat{G}(\cdot)$ of $G(\cdot)$ is asymptotically represented as $\hat{G}(t) - G(t) = n^{-1} \sum_{i=1}^{n} g_i(t) + o_p(n^{-1/2})$, where the analytic form of $g_i(t)$ can be found in Reid [8].
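For readers less familiar with the Kaplan-Meier estimator of a censoring distribution, a minimal sketch follows. It treats censoring as the "event" of interest, so the product-limit steps occur at censored observation times. The data, the function name `kaplan_meier_censoring`, and the naive handling of ties (all subjects tied at a time are kept in the risk set) are assumptions for illustration only.

```python
def kaplan_meier_censoring(times, events):
    """Product-limit estimate of G(t) = P(C > t) for the censoring time C.

    times  : observed follow-up times
    events : 1 if the event of interest was observed, 0 if censored
    Returns a list of (time, G_hat) pairs at the distinct censoring times.
    """
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    surv = 1.0
    steps = []
    i = 0
    while i < len(data):
        t = data[i][0]
        n_tied = 0
        n_censored = 0
        # Collect all observations tied at time t.
        while i < len(data) and data[i][0] == t:
            n_tied += 1
            n_censored += 1 - data[i][1]
            i += 1
        if n_censored > 0:
            surv *= 1.0 - n_censored / n_at_risk
            steps.append((t, surv))
        n_at_risk -= n_tied
    return steps


# Hypothetical follow-up data: subjects at times 2 and 4 are censored.
steps = kaplan_meier_censoring([1, 2, 3, 4, 5], [1, 0, 1, 0, 1])
print(steps)
```

In this example the estimate drops by the factor 1 - 1/4 at time 2 (four subjects at risk) and by 1 - 1/2 at time 4 (two subjects at risk).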

Table S1 Summary of the simulation results under Simulation setup II
Table S2 Summary of simulation results using polynomial basis $(1, \log(t), \sqrt{t}, 1/\sqrt{t}, 1/t)$ under Simulation setup II
Table S3 Summary of simulation results using polynomial basis $(1, 1/\sqrt{t}, t, t^2)$ under Simulation setup II

Table S4 Summary of simulation results using a B-spline basis (knot = null) in Simulation setup II
Table S6 Summary of simulation results using a B-spline basis (knot = (0.4, 0.7)) in Simulation setup II
Table S7 Summary of simulation results under Simulation setup I with censoring rate = 24%
Table S8 Summary of simulation results under Simulation setup II with censoring rate = 28%